SYSTEM AND METHOD FOR QUALITY OF SERVICE BASED SERVER CLUSTER POWER MANAGEMENT

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to systems and methods for server cluster power management, and more particularly to quality of service based server cluster power management.

2. Discussion of Background Art

A modern trend in network management is toward an "always-on" model. Such a model recognizes the pervasiveness of computers and information within everyday business and personal activities.

To manage such growing demands, large data centers consisting of many clients and servers are networked together in clusters. Such clusters may be configured to provide various redundant and high availability processes and services. Unfortunately, however, such clusters are still susceptible to power outages, which can bring all network traffic to a halt.

Figure 1 is a block diagram of a conventional server cluster system 100 both before and after a power interruption at time T0. The conventional cluster 100 includes four servers 102-108, coupled respectively to four Uninterruptible Power Supplies (UPSs) 110-116, which receive standard wall outlet power over line 118. Each UPS typically contains a battery backup (not shown) which provides power to its respective server upon detection of a power interruption and for a period thereafter until the batteries are exhausted.

As shown in Figure 1, at time T0, all four servers 102-108 are fully operational. However, if a power interruption occurs at time T0, there is a complete failure of the server cluster at time T1, when the UPS batteries have been exhausted. Thus all processes supported by the servers 102-108 are terminated and the network is down. Such a complete failure is indiscriminate of the importance of any traffic passing through or processes being executed by the servers, and is very much an "all or nothing" power management design. Such designs fall short of client expectations and network demands in this modern era.

In response to the concerns discussed above, what is needed is a system and method for server cluster power management that overcomes the problems of the prior art.

SUMMARY OF THE INVENTION

The present invention is a system and method for Quality of Service (QoS) based server cluster power management. The method of the present invention includes the steps of: grouping activities within a server cluster into predefined sets; assigning a priority level to each set; identifying a first server hosting a first set of lower-priority activities within the cluster; receiving a power interruption signal; and diverting power reserves of the first server to another server in the cluster, in response to the power interruption signal.

The system of the present invention includes: servers, hosting a plurality of activity sets each having an associated QoS level; power reserves coupled to the servers; a switch matrix coupled to direct the power reserves between the servers; and a power manager, coupled to the switch matrix, for commanding the switch matrix to divert power from servers hosting low QoS activity sets to servers hosting high-priority activity sets, in response to a power interruption.

The system and method of the present invention are particularly advantageous over the prior art because QoS concepts are applied to server cluster power management. These and other aspects of the invention will be recognized by those skilled in the art upon review of the detailed description, drawings, and claims set forth below.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of a conventional server cluster system;

Figure 2 is a block diagram of a Quality of Service (QoS) based server cluster power management system;

Figure 3 is a flowchart of a method for Quality of Service based server cluster power management;

Figure 4 is a block diagram of one of many possible ways to manage power in the server cluster in response to a power interruption;

Figure 5 is a graph of how a power interruption affects available server cluster power in both the QoS based system and the conventional server cluster system; and

Figure 6 is a graph of how a power interruption affects QoS in both the QoS based system and the conventional server cluster system.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Figure 2 is a block diagram of a Quality of Service (QoS) based server cluster power management system 200. The system 200, shown in just one of many possible embodiments, includes servers 1 through 4 (202-208), each provided power by Uninterruptible Power Supplies (UPSs) 1 through 4 (210-216) respectively. A standard power line 218 provides wall outlet power to each of the UPSs 210-216. Batteries within each UPS are connected to a power divert line 220. The power divert line 220 is coupled to a switch matrix 222 which can divert battery power from a set of UPSs to any other set of UPSs. A power manager 224 software module, executing power management algorithms, is coupled to the switch matrix 222 and the UPSs 210-216 by a Simple Network Management Protocol (SNMP) line 226, and to the servers 202-208 by a Quality of Service (QoS) line 228. The power manager 224 and the switch matrix 222 are preferably housed in a power controller 230. Together these elements make up a server cluster network. Operation of the system 200 is discussed with reference to Figure 3.

Figure 3 is a flowchart of a method 300 for Quality of Service (QoS) based server cluster power management. Quality of Service (QoS) is a standard phrase originating from the idea that client-server network performance, such as transmission and error rates, can be managed in real time. While such QoS concepts have been applied to network packet switching and data management, they have not been applied to diverting power between different servers within a server cluster.

The method begins in step 302, where a network administrator groups server activities into predefined sets. The predefined sets are defined by the network administrator depending upon how the administrator intends to manage power reserves within the network after a power interruption occurs. Examples of such predefined sets include: types of data transmitted by each of the servers 202-208 over the network; processes and applications, redundant or otherwise, executing on each of the servers 202-208; or any other useful differentiation of activity on the servers 202-208. Data types include: voice, video, and bulk data. Processes and applications include: e-mail, word processing, virus detection, firewalls, daemons, as well as many others.

In step 304, the network administrator assigns a QoS level to each set. Activity sets assigned a higher QoS can also be thought of as having a higher operational priority level. In step 306, the power manager 224 monitors server activities and the QoS level assigned to each set of server activity over QoS line 228. QoS levels transmitted over the QoS line 228 preferably follow the Common Open Policy Service (COPS) protocol. COPS is a protocol for exchanging QoS information over a network. COPS protocols are discussed in an Internet-Draft working document generated by the Internet Engineering Task Force (IETF). In step 308, the power manager 224 generates a priority list, organizing server activities based on their assigned QoS levels.
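By way of illustration only, steps 304 through 308 can be sketched as follows. The `Activity` record, the `build_priority_list` function, the sample activities, and the numeric QoS scale (a larger number meaning a higher priority) are all assumptions made for this sketch and are not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Hypothetical record for one activity set (steps 302-304)."""
    name: str
    server: int      # number of the hosting server, e.g. 1 through 4
    qos_level: int   # assumption: a larger value means a higher priority


def build_priority_list(activities):
    """Order activity sets from highest to lowest QoS level (step 308)."""
    return sorted(activities, key=lambda a: a.qos_level, reverse=True)


# Illustrative data only; the QoS levels are chosen by the administrator.
activities = [
    Activity("bulk data", server=3, qos_level=1),
    Activity("voice", server=1, qos_level=5),
    Activity("e-mail", server=2, qos_level=3),
    Activity("video", server=4, qos_level=4),
]
priority_list = build_priority_list(activities)
print([a.name for a in priority_list])
```

Sorting once on the QoS level keeps the list simple to regenerate whenever the monitored activities or their assigned levels change.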

In step 310, one or more of the UPSs 210-216 detect a power interruption on the standard power line 218. In response, a power interruption signal is sent from the UPSs 210-216 to the power manager 224 over the SNMP line 226, in step 312. Next, in step 314, the power manager 224 sends a server shutdown command to one or more of the UPSs 210-216 over the SNMP line 226.

The power manager 224 selects which of the servers 202-208 to shut down based on the priority list. How exactly the shutdown selections are made, however, is dependent upon how the network administrator programs the power manager 224 to respond to the power interruption signal. For example, the network administrator can program the power manager 224 to identify the server hosting the activity which is highest on the priority list and shut down all other servers. Or, the network administrator can program the power manager 224 to identify the top five activities on the priority list, command the servers 202-208 to inactivate all other activities on the priority list, transfer those five highest priority activities to a single server, and shut down the other servers. Thus, cluster power management is under full control of the network administrator. Those skilled in the art will also recognize that the present invention provides an ability to divert power between servers for reasons not even related to power interruptions, but instead for any power management reason.
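The two example policies above can be sketched as follows. This is a minimal illustration, not the power manager's actual algorithm: the function names, the representation of the priority list as (activity, server) pairs ordered from highest to lowest priority, and the sample data are all assumptions.

```python
def keep_top_server(priority_list, servers):
    """Policy 1: keep only the server hosting the highest-priority activity.

    Returns the servers to shut down. priority_list is assumed to be a
    non-empty list of (activity, server) pairs, highest priority first.
    """
    top_server = priority_list[0][1]
    return sorted(s for s in servers if s != top_server)


def consolidate_top_n(priority_list, servers, n, target):
    """Policy 2: move the top n activities to one target server.

    Returns (activities to transfer to the target, activities to
    inactivate, servers to shut down).
    """
    transfer = [name for name, _ in priority_list[:n]]
    inactivate = [name for name, _ in priority_list[n:]]
    shut_down = sorted(s for s in servers if s != target)
    return transfer, inactivate, shut_down


# Illustrative data only.
priority_list = [("voice", 1), ("video", 4), ("e-mail", 2), ("bulk data", 3)]
servers = [1, 2, 3, 4]
print(keep_top_server(priority_list, servers))
print(consolidate_top_n(priority_list, servers, n=2, target=1))
```

Keeping each policy as a separate function reflects the point made above that the selection rule is whatever the network administrator chooses to program.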

In step 316, the power manager 224 sends a divert battery power command to the switch matrix 222, directing the matrix 222 to reroute reserve battery power from those UPSs that were sent the server shutdown command to those UPSs powering the servers which remain operational. After step 316, the method 300 ends.
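As a rough bookkeeping model of step 316, the following sketch pools the reserves of the shut-down UPSs and credits them to the UPSs that remain in service. The function name, the watt-hour units, the even split among remaining UPSs, and the lossless transfer are assumptions for illustration; the actual routing is performed by the switch matrix hardware and is not specified at this level.

```python
def divert_reserves(reserves_wh, shut_down):
    """Return per-UPS reserves after diverting power away from shut-down UPSs.

    reserves_wh: dict mapping UPS number to remaining reserve in watt-hours.
    shut_down:   set of UPS numbers whose servers were shut down.
    Assumes a lossless, even split among the UPSs that remain in service.
    """
    running = [u for u in reserves_wh if u not in shut_down]
    if not running:
        # Nothing left to power; leave the reserves untouched.
        return dict(reserves_wh)
    pooled = sum(wh for u, wh in reserves_wh.items() if u in shut_down)
    share = pooled / len(running)
    return {
        u: 0.0 if u in shut_down else wh + share
        for u, wh in reserves_wh.items()
    }


# Illustrative data only: four UPSs with equal reserves, three shut down.
print(divert_reserves({1: 100.0, 2: 100.0, 3: 100.0, 4: 100.0}, {2, 3, 4}))
```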

Figure 4 is a block diagram 400 of one of many possible ways to manage power in the server cluster in response to the power interruption on the standard power line 218. In the Figure, the power manager 224 has commanded: UPS 2 212 to shut down server 2 204, UPS 3 214 to shut down server 3 206, UPS 4 216 to shut down server 4 208, and the switch matrix 222 to route reserve battery power from UPSs 2, 3 and 4 (212, 214, and 216) to UPS 1 210 so that server 1 202 can be kept operational for as long as possible during the power interruption.

Figure 5 is a graph 500 of how a power interruption, at time T0, affects available server cluster power 502 in both the QoS based system 200 and the conventional server cluster system 100. As shown by curve 504, when a power interruption occurs at time T0 in the conventional system 100, a step-wise complete power failure of servers 1 through 4 (102-108) occurs at time T1, as battery reserves in the conventional system's 100 UPSs 110-116 are exhausted all at about the same time. Total system 100 battery reserves are equal to an area under curve 504.

In contrast, as shown by curve 506, when a power interruption occurs at time T0 in the QoS based system 200 and servers 2 through 4 (204-208) are shut down and battery reserves in UPSs 212-216 are diverted to server 1 202, server 1's 202 time of operation is extended to a time T2, which is far beyond time T1.

Thus, while total QoS system 200 battery reserves (equal to an area under curve 506) are equal to total conventional system 100 battery reserves, the present invention manages that same limited reserve of battery power so that server 1's 202 operation may be extended until time T2. As a result, those activities highest on the priority list may continue servicing the cluster network longer than in the conventional system 100.
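The equal-area argument can be checked with simple arithmetic. The sketch below assumes that every server draws the same constant power, that every UPS holds the same reserve, that diversion is lossless, and that the three lower-priority servers are shut down immediately at T0; the 500 Wh and 250 W figures are invented purely for illustration.

```python
def runtime_hours(total_reserve_wh, servers_running, draw_w_per_server):
    """Hours until a fixed energy reserve is exhausted at constant draw."""
    return total_reserve_wh / (servers_running * draw_w_per_server)


TOTAL_RESERVE_WH = 4 * 500.0   # four UPSs, hypothetical 500 Wh each
DRAW_W = 250.0                 # hypothetical per-server draw in watts

# Conventional system: all four servers run until time T1.
t1_minus_t0 = runtime_hours(TOTAL_RESERVE_WH, 4, DRAW_W)

# QoS based system: only server 1 runs, on the pooled reserve, until T2.
t2_minus_t0 = runtime_hours(TOTAL_RESERVE_WH, 1, DRAW_W)

print(t1_minus_t0, t2_minus_t0)
```

Under these assumptions the same total energy lasts four times as long when only one of four identical servers is kept running, which is the relationship between T1 and T2 suggested by curves 504 and 506.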

Figure 6 is a graph 600 of how a power interruption, at time T0, affects QoS 602 in both the QoS based system 200 and the conventional server cluster system 100. As shown by curve 604, when a power interruption occurs at time T0 in the conventional system 100, a step-wise complete shutdown of all activities on servers 1 through 4 (102-108) occurs at time T1, as battery reserves in the conventional system's 100 UPSs 110-116 are exhausted all at about the same time.

In contrast, as shown by curve 606, when a power interruption occurs at time T0 in the QoS based system 200 and servers 2 through 4 (204-208) are shut down and battery reserves in UPSs 212-216 are diverted to server 1 202, server 1's 202 overall Quality of Service for hosted high-priority activities is extended until time T2. The curve 606 also shows that while, depending upon how QoS is measured, QoS may initially dip below QoS for the conventional system 100 at time TX, QoS is basically maintained at a constant level all the way until time TY in the QoS based system 200. Depending upon how the network administrator configures the power manager 224, the initial dip can be due to a shutdown of lower-priority activities that cannot be maintained on server 1 202, while the conventional system 100 continues to host all activities. The somewhat graceful decline in QoS from time T0 until T2 is again determined by how the network administrator configures the power manager 224, and can be due to the power manager 224 incrementally shutting down lower-priority server activities as power reserves dwindle.
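One possible way to shut down lower-priority activities incrementally is sketched below. The proportional rule used here (keep a share of the priority list matching the fraction of reserve remaining, and never fewer than one activity) is purely an assumption chosen for illustration; the description leaves the actual rule to the network administrator.

```python
import math


def activities_to_keep(priority_list, reserve_fraction):
    """Return the activities to keep running, highest priority first.

    priority_list:    activity names ordered from highest to lowest priority.
    reserve_fraction: remaining battery reserve as a fraction from 0.0 to 1.0.
    Assumed rule: keep a proportional share of the list, at least one entry.
    """
    count = max(1, math.ceil(len(priority_list) * reserve_fraction))
    return priority_list[:count]


# Illustrative data only.
ordered = ["voice", "video", "e-mail", "bulk data"]
for fraction in (1.0, 0.5, 0.1):
    print(fraction, activities_to_keep(ordered, fraction))
```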

While one or more embodiments of the present invention have been described, those skilled in the art will recognize that various modifications may be made. Variations upon and modifications to these embodiments are provided by the present invention, which is limited only by the following claims.